## Overview

This repository contains the code for meaningful learning (MeanLearn), which aims to advance abstract reasoning in LLMs via generic fact guidance.



## Structure

```bash
--preliminary # this folder contains all the scripts for the preliminary study, including inference (run_llama2.py, run_orca2.py) and evaluation (process_result.py).
--data # this folder contains the AbsR data
--clustering # this folder contains the clusters of OOD benchmarks such as MMLU, which are used to calculate AbsAcc.
--abstract_performance_clustering.py # the evaluation script to calculate AbsAcc
--performance_bbh.py # the script to compute vanilla accuracy on the BBH benchmark
--perfrmance_classification.py # the script to compute vanilla accuracy on other benchmarks, such as MMLU
--reason_instant.py # the script to conduct inference with baselines or MeanLearn
--tools.py # this script contains utility functions shared by the other scripts
--train.py # the script to train MeanLearn
```



## Train MeanLearn and Run Inference

```bash
python train.py # train MeanLearn

python reason_instant.py # run inference for evaluation
```



## Evaluate

```bash
python perfrmance_classification.py # evaluate on benchmarks other than BBH, such as MMLU and AGIEval
python performance_bbh.py # evaluate on BBH
python abstract_performance_clustering.py # compute AbsAcc
```
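
The evaluation scripts above report two kinds of numbers: vanilla accuracy and the cluster-based AbsAcc. The sketch below illustrates how the two can differ on the same predictions. It is not taken from this repository: the record layout (`cluster`, `correct`) and both function names are hypothetical, and it assumes that a cluster counts as solved only when every example in it is answered correctly. Check `abstract_performance_clustering.py` for the definition actually used.

```python
from collections import defaultdict


def vanilla_accuracy(records):
    """Fraction of individual examples answered correctly."""
    if not records:
        return 0.0
    return sum(1 for r in records if r["correct"]) / len(records)


def cluster_accuracy(records):
    """Fraction of clusters in which every example is answered correctly.

    Assumption: this mirrors an AbsAcc-style metric; it is an illustration,
    not the repository's implementation.
    """
    clusters = defaultdict(list)
    for r in records:
        clusters[r["cluster"]].append(r["correct"])
    if not clusters:
        return 0.0
    solved = sum(1 for flags in clusters.values() if all(flags))
    return solved / len(clusters)


# Hypothetical predictions: two clusters with two examples each.
records = [
    {"cluster": 0, "correct": True},
    {"cluster": 0, "correct": True},
    {"cluster": 1, "correct": True},
    {"cluster": 1, "correct": False},
]

print(vanilla_accuracy(records))  # 0.75
print(cluster_accuracy(records))  # 0.5
```

Under this assumption, a single wrong answer in cluster 1 leaves vanilla accuracy at 0.75 but drops the cluster-level score to 0.5, which is why the two metrics are computed by separate scripts.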

